TGF-beta induced gene and protein.



A new TGF-β induced gene and protein are described. Treating cells with TGF-β arrests their growth
and induces expression of a novel gene that encodes a 683 amino acid protein, designated βIG-H3.
The protein contains four homologous repeat regions and may represent a cell surface recognition
molecule. The gene and protein are induced in mammalian cells, and specifically in human cells, upon
treatment with TGF-β.

The present invention describes a novel TGF-β induced gene, βig-h3, and the protein encoded by this
induced gene, βIG-H3, produced in response to TGF-β mediated growth inhibition of specific human cell lines.

BACKGROUND OF THE INVENTION 

Transforming growth factor-β1 (TGF-β1) is a multifunctional regulator of cell growth and differentiation. It
is capable of causing diverse effects such as inhibition of the growth of monkey kidney cells (Tucker, R.F.,
G.D. Shipley, H.L. Moses & R.W. Holley (1984) Science 226:705-707), inhibition of growth of several human
cancer cell lines (Roberts, A.B., M.A. Anzano, L.M. Wakefield, N.S. Roches, D.F. Stern & M.B. Sporn (1985)
Proc. Natl. Acad. Sci. USA 82:119-123; Ranchalis, J.E., L.E. Gentry, Y. Agawa, S.M. Seyedin, J. McPherson,
A. Purchio & D.R. Twardzik (1987) Biochem. Biophys. Res. Commun. 148:783-789), inhibition of mouse
keratinocytes (Coffey, R.J., N.J. Sipes, C.C. Bascum, R. Gravesdeal, C. Pennington, B.E. Weissman & H.L. Moses
(1988) Cancer Res. 48:1596-1602; Reiss, M. & C.L. Dibble (1988) In Vitro Cell. Dev. Biol. 24:537-544),
stimulation of growth of AKR-2B fibroblasts (Tucker, R.F., M.E. Olkenant, E.L. Branum & H.L. Moses (1988) Cancer
Res. 43:1581-1586) and normal rat kidney fibroblasts (Roberts, A.B., M.A. Anzano, L.C. Lamb, J.M. Smith &
M.B. Sporn (1981) Proc. Natl. Acad. Sci. USA 78:5339-5343), stimulation of synthesis and secretion of
fibronectin and collagen (Ignotz, R.A. & J. Massague (1986) J. Biol. Chem. 261:4337-4345; Centrella, M., T.L.
McCarthy & E. Canalis (1987) J. Biol. Chem. 262:2869-2874), induction of cartilage-specific macromolecule
production in muscle mesenchymal cells (Seyedin, S.M., A.Y. Thompson, H. Bentz, D.M. Rosen, J. McPherson,
A. Contin, N.R. Siegel, G.R. Galluppi & K.A. Piez (1986) J. Biol. Chem. 261:5693-5695), and growth inhibition
of T and B lymphocytes (Kehrl, J.H., L.M. Wakefield, A.B. Roberts, S. Jakowlew, M. Alvarez-Mon, R. Derynck,
M.B. Sporn & A.S. Fauci (1986) J. Exp. Med. 163:1037-1050; Kehrl, J.H., A.B. Roberts, L.M. Wakefield, S.
Jakowlew, M.B. Sporn & A.S. Fauci (1987) J. Immunol. 137:3855-3860; Kasid, A., G.I. Bell & E.P. Director (1988)
J. Immunol. 141:690-698; Wahl, S.M., D.A. Hunt, H.L. Wong, S. Dougherty, N. McCartney-Francis, L.M. Wahl,
L. Ellingsworth, J.A. Schmidt, G. Hall, A.B. Roberts & M.B. Sporn (1988) J. Immunol. 140:3026-3032).

Recent investigations have indicated that TGF-β1 is a member of a family of closely related
growth-modulating proteins including TGF-β2 (Seyedin, S.M., P.R. Segarini, D.M. Rosen, A.Y. Thompson, H. Bentz
& J. Graycar (1987) J. Biol. Chem. 262:1946-1949; Cheifetz, S., J.A. Weatherbee, M.L.-S. Tsang, J.K. Anderson,
J.E. Mole, R. Lucas & J. Massague (1987) Cell 48:409-415; Ikeda, T., M.M. Lioubin & H. Marquardt (1987)
Biochemistry 26:2406-2410), TGF-β3 (TenDijke, P., P. Hansen, K. Iwata, C. Pieler & J.G. Foulkes (1988) Proc.
Natl. Acad. Sci. USA 85:4715-4719; Derynck, R., P. Lindquist, A. Lee, D. Wen, J. Tamm, J.L. Graycar, L. Rhee,
A.J. Mason, D.A. Miller, R.J. Coffey, H.L. Moses & E.Y. Chen (1988) EMBO J. 7:3737-3743; Jakowlew, S.B.,
P.J. Dillard, P. Kondaiah, M.B. Sporn & A.B. Roberts (1988) Mol. Endocrinology 2:747-755), TGF-β4 (Jakowlew,
S.B., P.J. Dillard, M.B. Sporn & A.B. Roberts (1988) Mol. Endocrinology 2:1186-1195), Mullerian inhibitory
substance (Cate, R.L., R.J. Mattaliano, C. Hession, R. Tizard, N.M. Faber, A. Cheung, E.G. Ninfa, A.Z. Frey,
D.J. Dash, E.P. Chow, R.A. Fisher, J.M. Bertonis, G. Torres, B.P. Wallner, K.L. Ramachandran, R.C. Ragin,
T.F. Manganaro, D.T. Maclaughlin & P.K. Donahoe (1986) Cell 45:685-698), and the inhibins (Mason, A.J., J.S.
Hayflick, N. Ling, F. Esch, N. Ueno, S.-Y. Ying, R. Guillemin, H. Niall & P.H. Seeburg (1985) Nature
318:659-663).

TGF-β1 is a 24-kDa protein consisting of two identical disulfide-bonded 12-kDa subunits (Assoian, R.K.,
A. Komoriya, C.A. Meyers, D.M. Miller & M.B. Sporn (1983) J. Biol. Chem. 258:7155-7160; Frolik, C.A., L.L.
Dart, C.A. Meyers, D.M. Miller & M.B. Sporn (1983) Proc. Natl. Acad. Sci. USA 80:3676-3680; Frolik, C.A., L.M.
Wakefield, D.M. Smith & M.B. Sporn (1984) J. Biol. Chem. 259:10995-11000). Analysis of cDNA clones coding
for human (Derynck, R., J.A. Jarrett, E.Y. Chen, D.H. Eaton, J.R. Bell, R.K. Assoian, A.B. Roberts, M.B. Sporn
& D.V. Goeddel (1985) Nature 316:701-705), murine (Derynck, R., J.A. Jarrett, E.Y. Chen & D.V. Goeddel
(1986) J. Biol. Chem. 261:4377-4379) and simian (Sharples, K., G.D. Plowman, T.M. Rose, D.R. Twardzik &
A.F. Purchio (1987) DNA 6:239-244) TGF-β1 indicates that this protein is synthesized as a larger 390 amino
acid pre-pro-TGF-β1 precursor; the carboxyl terminal 112 amino acid portion is then proteolytically cleaved to
yield the TGF-β1 monomer.
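The residue counts quoted above can be checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the figure of roughly 110 Da per amino acid residue is a common rule of thumb assumed here, not a value taken from this specification.

```python
# Residue counts quoted in the text above.
PRECURSOR_LENGTH = 390   # pre-pro-TGF-beta1 precursor, amino acids
MATURE_LENGTH = 112      # carboxyl-terminal portion released by cleavage

# Assumption (not from the text): about 110 Da average mass per residue.
AVERAGE_RESIDUE_MASS_DA = 110.0


def mature_region(precursor_length: int, mature_length: int) -> tuple[int, int]:
    """Return 1-based (first, last) residue numbers of the C-terminal mature region."""
    first = precursor_length - mature_length + 1
    return first, precursor_length


def approx_mass_kda(n_residues: int, subunits: int = 1) -> float:
    """Rough polypeptide mass in kDa from residue count (ignores glycosylation)."""
    return n_residues * subunits * AVERAGE_RESIDUE_MASS_DA / 1000.0


first, last = mature_region(PRECURSOR_LENGTH, MATURE_LENGTH)
print(f"mature monomer spans residues {first}-{last}")
print(f"monomer ~{approx_mass_kda(MATURE_LENGTH):.1f} kDa, "
      f"dimer ~{approx_mass_kda(MATURE_LENGTH, subunits=2):.1f} kDa")
```

Under these assumptions the estimate lands close to the 12-kDa subunit and 24-kDa dimer sizes stated in the text.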

The simian TGF-β1 cDNA clone has been expressed to high levels in Chinese hamster ovary (CHO) cells.
Analysis of the proteins secreted by these cells using site-specific antipeptide antibodies, peptide mapping,
and protein sequencing revealed that both mature and precursor forms of TGF-β were produced and were held
together, in part, by a complex array of disulfide bonds (Gentry, L.E., N.R. Webb, J. Lim, A.M. Brunner, J.E.
Ranchalis, D.R. Twardzik, M.N. Lioubin, H. Marquardt & A.F. Purchio (1987) Mol. Cell. Biol. 7:3418-3427;
Gentry, L.E., M.N. Lioubin, A.F. Purchio & H. Marquardt (1988) Mol. Cell. Biol. 8:4162-4168). Upon purification
away from the 24-kDa mature rTGF-β1, the 90 to 110 kDa precursor complex was found to consist of three species:
pro-TGF-β1, the pro-region of the TGF-β1 precursor, and mature TGF-β1 (Gentry, L.E., N.R. Webb, J. Lim,
A.M. Brunner, J.E. Ranchalis, D.R. Twardzik, M.N. Lioubin, H. Marquardt & A.F. Purchio (1987) Mol. Cell. Biol.
7:3418-3427; Gentry, L.E., M.N. Lioubin, A.F. Purchio & H. Marquardt (1988) Mol. Cell. Biol. 8:4162-4168).
Detection of optimal biological activity required acidification before analysis, indicating that rTGF-β1 was
secreted in a latent form.

The pro-region of the TGF-β1 precursor was found to be glycosylated at three sites (Asn 82, Asn 136, and
Asn 176), and the first two of these (Asn 82 and Asn 136) contain mannose-6-phosphate residues (Brunner,
A.M., L.E. Gentry, J.A. Cooper & A.F. Purchio (1988) Mol. Cell. Biol. 8:2229-2232; Purchio, A.F., J.A. Cooper,
A.M. Brunner, M.N. Lioubin, L.E. Gentry, K.S. Kovacina, R.A. Roth & H. Marquardt (1988) J. Biol. Chem.
263:14211-14215). In addition, the rTGF-β1 precursor is capable of binding to the mannose-6-phosphate
receptor, which may imply a mechanism for delivery to lysosomes where proteolytic processing can occur (Kornfeld,
S. (1986) J. Clin. Invest. 77:1-6).

TGF-β2 is also a 24-kDa homodimer of identical disulfide-bonded 112 amino acid subunits (Marquardt, H.,
M.N. Lioubin & T. Ikeda (1987) J. Biol. Chem. 262:12127-12131). Analysis of cDNA clones coding for human
(Madisen, L., N.R. Webb, T.M. Rose, H. Marquardt, T. Ikeda, D. Twardzik, S. Seyedin & A.F. Purchio (1988)
DNA 7:1-8; DeMartin, R., B. Haendler, R. Hoefer-Warbinek, H. Gaugitsch, M. Wrann, H. Schlusener, J.M.
Seifert, S. Bodmer, A. Fontana & E. Hoefer (1987) EMBO J. 6:3673-3677) and simian (Hanks, S.K., R. Armour, J.H.
Baldwin, F. Maldonado, J. Spiess & R.W. Holley (1988) Proc. Natl. Acad. Sci. USA 85:79-82) TGF-β2 showed that
it, too, is synthesized as a larger precursor protein. The mature regions of TGF-β1 and TGF-β2 show 70%
homology, whereas 30% homology occurs in the pro-region of the precursor. The simian and human
TGF-β2 precursor proteins differ by a 28 amino acid insertion in the pro-region; the mRNAs coding for these
two proteins are thought to arise via differential splicing (Webb, N.R., L. Madisen, T.M. Rose & A.F. Purchio
(1988) DNA 7:493-497).

The effects of TGF-β are thought to be mediated by binding to specific receptors present on the surface
of most cells (Massague, J. et al. (1985) J. Biol. Chem. 260:2636-2645; Segarini, P.R. et al. (1989) Mol.
Endocrinol. 3:261-272; Tucker, R.F. et al. (1984) Proc. Natl. Acad. Sci. USA 81:6757-6761; Wakefield, L.M. et al.
(1987) J. Cell Biol. 105:965-975). Chemical crosslinking of [125I]-labeled TGF-β to cell surface components has
identified three receptor size classes having molecular weights of 53-70 kDa (type I receptor), 80-120 kDa (type
II receptor) and 250-350 kDa (type III receptor). The type I and II receptors have been implicated in signal
transduction (Boyd, F.T. et al. (1989) J. Biol. Chem. 264:2272-2278; Laiho, M. et al. (1990) J. Biol. Chem.
265:18518-18524), while the type III receptor has been suggested to act as a storage protein (Segarini, P.R. et al.
(1989) Mol. Endocrinol. 3:261-272). Little is known concerning the signal transduction mechanisms which occur
after receptor-ligand interaction.

The pleiotropic effects of TGF-β may be due to its ability to affect the transcription of other genes.
TGF-β has been shown to induce fos, myc and sis in AKR-2B cells (Leof, E.B. et al. (1986) Proc. Natl. Acad. Sci.
USA 83:1453-1458), enhance expression of c-jun B in A549 cells (Pertovaara, L. et al. (1989) Molecular
and Cellular Biology 9:1255-1264), increase the mRNA for matrix proteins (Penttinen, R.P. et al. (1988)
Proc. Natl. Acad. Sci. USA 85:1105-1110), IL-6 (Elias, J.A. et al. (1991) J. Immunol. 146:3437-3446) and
EGF receptors (Thompson, K.L. et al. (1988) J. Biol. Chem. 263:19519-19528), and decrease expression of PDGF
receptor α subunits (Battegay, E.J. et al. (1990) Cell 63:515-524). It alters the pattern of integrin expression
in osteosarcoma cells (Heino, J. et al. (1989) J. Biol. Chem. 264:21806-21813) and decreases the expression of
c-myc in keratinocytes (Coffey, R.J. et al. (1988b) Cancer Res. 48:1596-1602). TGF-β induces expression of
IL-1β, TNF-α, PDGF and bFGF in human peripheral blood monocytes (McCartney-Francis, N. et al. (1991) DNA
and Cell Biology 10:293-300).

SUMMARY OF THE INVENTION 

The present invention is directed to a novel protein and gene induced by transforming growth factor beta
(TGF-β) in mammalian cells.

In order to identify novel genes that encode protein products which might be involved in mediating some
of the effects of TGF-β, a cDNA library was constructed from mRNA isolated from mammalian cells, such as
human lung adenocarcinoma cells, which had been growth arrested by exposure to TGF-β. Several clones
were isolated. One clone, termed TGF-β induced gene-h3 (βig-h3), encoded a novel protein, βIG-H3, containing
683 amino acid residues.

In the present invention a TGF-β induced protein is produced in growth arrested mammalian cells and
preferably contains about 683 amino acid residues. The TGF-β induced protein preferably contains four
homologous repeat regions of approximately 140 amino acids each and has an Arg-Gly-Asp sequence near its carboxy
terminus. Treatment of mammalian cells such as human adenocarcinoma cells and embryonic mesenchymal
cells with TGF-β produces a 10 to 20 fold increase in these cells of a 3.4 kb RNA transcript that encodes a
protein of this invention.

The present invention is further directed to the protein βIG-H3, which contains a 683 amino acid residue
sequence corresponding to Sequence ID Number 2 and which contains an Arg-Gly-Asp at residues 642-644
of the amino acid sequence depicted in FIGURE 5. βIG-H3 contains four homologous repeat regions that share
at least 16% homology with each other.

The present invention is also directed to a nucleotide sequence that encodes a gene whose expression
is strongly induced by TGF-β. The nucleotide sequence of the present invention can induce the production of
an RNA transcript of about 3.4 kb and preferably encodes βIG-H3.

DESCRIPTION OF THE FIGURES 

In the drawings:

FIGURE 1 illustrates the expression of βig-h3 in A549 cells after treatment with TGF-β1 and TGF-β2. Confluent
dishes of A549 cells grown in DMEM + 10% FBS were split 1:10. Twenty hours later, they were treated
with 20 ng/ml rTGF-β1 (A and C) or rTGF-β2 (D) for 72 hours. Total RNA was isolated and 25 µg was fractionated
on an agarose-formaldehyde gel and analyzed by Northern blotting using [32P]-labeled βig-h3 probe. Lane
1, RNA from untreated cells; lane 2, RNA from TGF-β treated cells. Exposure time for A and D, 10 hours;
exposure time for C, 3 days. Panel B is a photograph of the gel in panel A stained with methylene blue. Bands were
quantitated using a Molecular Dynamics Phosphoimager.

FIGURE 2 illustrates the time course for induction of βig-h3 mRNA by TGF-β1. Confluent dishes of A549
cells were split 1:10. Twenty hours later, they were treated with TGF-β1 (20 ng/ml) for 6 hours (lane 2), 24 hours
(lane 3), 48 hours (lane 4), 72 hours (lane 5), or 96 hours (lane 6). RNA was isolated and hybridized to
[32P]-labeled βig-h3 probe. Lane 1 contains RNA from untreated cells.

FIGURE 3 illustrates that removal of TGF-β1 from the culture medium of A549 cells leads to a decrease in
synthesis of βig-h3 RNA. A549 cells were treated with TGF-β1 (20 ng/ml) for 3 days. Cells were then washed
and grown in complete medium without TGF-β1 for 24 hours (lane 2), 48 hours (lane 3), 72 hours (lane 4) or
3 weeks (lane 5). RNA was extracted and analyzed by Northern blotting using [32P]-labeled βig-h3 probe. Lane
1 contains RNA from A549 cells treated for 3 days with TGF-β1.

FIGURE 4 illustrates the determination of βig-h3 mRNA half-life. A549 cells were treated with TGF-β (20
ng/ml) for 48 hours. Actinomycin D (10 ng/ml) was then added and RNA was extracted at the indicated times
and analyzed by Northern blotting with [32P]-labeled βig-h3 probe. Bands were quantitated using a Molecular
Dynamics Phosphoimager and are plotted as percentage of cpm remaining in the 3.4 kb βig-h3 RNA band.
O O, untreated cells; O O, TGF-β treated cells.

FIGURE 5 illustrates the nucleotide and deduced amino acid sequence of βIG-H3. Sequencing was
performed as described (Sanger, F. et al. (1977) Proc. Natl. Acad. Sci. USA 74:5463-5467) and two independent
clones were sequenced for each region. The signal sequence is overlined and arrows mark predicted cleavage
sites; the RGD sequence is boxed. Repeats 1 through 4 are bracketed and a polyadenylation signal at
nucleotide 2625 is indicated (horizontal bracket).

FIGURE 6A illustrates the 4 homologous domains of βIG-H3 compared with the third repeats from
Drosophila fasciclin-I (DrF-3), grasshopper fasciclin-I (GrF-3), and the carboxy terminal half of the Mycobacterium
bovis protein Mpb70. Boxed amino acids are identical to at least 2 others at that same position.

FIGURE 6B illustrates the 4 repeats of βIG-H3 directly compared. Boxed amino acids are identical with at
least 1 other at that same position. Multiple alignments were generated using the program Pileup of UW/GCG
software.

DESCRIPTION OF PREFERRED EMBODIMENTS

The present invention is directed to a nucleotide sequence and a protein that are induced in mammalian
cells in response to TGF-β.

The arrest of the growth of specific mammalian cells, such as human lung adenocarcinoma cells, by
treatment with TGF-β resulted in the induction of a novel gene product. TGF-β refers to a family of highly
related dimeric proteins which are known to regulate the growth and differentiation of many cell types. As used
herein, the term TGF-β refers to any member of the family of transforming growth factor beta, which includes
TGF-β1, TGF-β2, TGF-β3, TGF-β4 and TGF-β5 as well as the TGF-β1/β2 hybrid molecules, designated 5-β.

TGF-β is known to regulate the transcription of several genes, such as the genes encoding c-myc, c-sis,
and the platelet-derived growth factor receptor. In the present invention, an attempt was made to identify novel
genes whose protein products could be involved in mediating some of the pleiotropic effects of TGF-β. As a
result of the present invention a new gene product has been identified in mammalian cells that have been
growth arrested by TGF-β.
All amino acid residues identified herein are in the natural L-configuration. In keeping with standard
polypeptide nomenclature, abbreviations for amino acid residues are as follows:

Amino acid                     Three-letter   One-letter
Alanine                        Ala            A
Arginine                       Arg            R
Asparagine                     Asn            N
Aspartic acid                  Asp            D
Aspartic acid or Asparagine    Asx            B
Cysteine                       Cys            C
Glutamine                      Gln            Q
Glutamic acid                  Glu            E
Glycine                        Gly            G
Glutamic acid or Glutamine     Glx            Z
Histidine                      His            H
Isoleucine                     Ile            I
Leucine                        Leu            L
Lysine                         Lys            K
Methionine                     Met            M
Phenylalanine                  Phe            F
Proline                        Pro            P
Serine                         Ser            S
Threonine                      Thr            T
Tryptophan                     Trp            W
Tyrosine                       Tyr            Y
Valine                         Val            V

In the present invention, a substantially pure protein is isolated. This protein is produced in a mammalian
cell in response to contacting the cell with sufficient TGF-β to arrest the growth of the mammalian cell.

As used herein the term "mammalian cell" refers to cells derived from a mammal, or mammalian tumor,
including human cells such as human lung adenocarcinoma cells, human embryonic palatal mesenchymal cells
and human prostatic adenocarcinoma cells.

As used herein the term "induced" refers to the stimulation, promotion and/or amplification of transcription
or translation in a target cell. In a preferred embodiment of the present invention either RNA or protein
production can be induced by TGF-β in a mammalian cell.

In a particularly preferred embodiment, the TGF-β induced protein of the present invention has an amino acid
residue sequence of about 683 amino acid residues.

When mammalian cells, such as human lung adenocarcinoma cells, were treated with TGF-β1, growth inhibition
of the cells resulted. A cDNA library was constructed and screened in order to isolate a clone which displayed
increased hybridization to a cDNA probe prepared from TGF-β1 treated cells. One clone was isolated and
designated βig-h3.

It was found that TGF-β1 and TGF-β2 each induced βig-h3 in cells. The induction was reversible and
resulted from an increase in transcription. Analysis of the induced βig-h3 DNA revealed an open reading frame
that encoded a novel 683 amino acid protein, βIG-H3, which contained a secretory leader signal sequence
and an Arg-Gly-Asp sequence. βIG-H3 contained four internal repeat regions. These repeat regions display
limited homology with short regions of grasshopper and Drosophila fasciclin-I and Mpb70 from Mycobacterium
bovis. Fasciclin-I is a surface recognition glycoprotein expressed on subsets of axon bundles in insect
embryos. Fasciclin-I contains four homologous 150 amino acid domains and has approximately 40% homology
between grasshopper and Drosophila (Zinn et al. (1988) Cell 53:577-583). It is thus considered in this
invention that βig-h3 may encode a novel surface recognition protein. As such, and as proposed for fasciclin-I, the
four homologous repeats could suggest a tetrameric structure with two binding sites, one at each intrachain
dimer. This structure would allow one βIG-H3 molecule to bind to a surface protein on two different cells.
Additionally, the Arg-Gly-Asp sequence in βIG-H3, which is not present in fasciclin-I, may allow for interactions
with various integrins.

βIG-H3 represents a new gene product induced by TGF-β and may illuminate the pleiotropic effects of
TGF-β as being due, in part, to its ability to regulate gene transcription. It has recently been shown that growth
inhibition by TGF-β is linked to inhibition of phosphorylation of pRB, the product of the retinoblastoma
susceptibility gene (Pietenpol et al. (1990) Cell 61:777-785; Laiho et al. (1990) Cell 62:175-185). If βIG-H3 is involved
in cell surface recognition, it may participate in cell-cell communication and in the transmission of intracellular
signals that are involved in negative growth control.

The present invention is further described by the following Examples which are intended to be illustrative
and not limiting.

EXAMPLE 1

Identification of βig-h3 and Induction by TGF-β

Several human cell lines were cultured and used in these studies. A549 and H2981 (both human lung
adenocarcinoma) cells, and the human breast carcinoma cell lines (MDA453, MDA468 and 293) were grown in
Dulbecco's Modified Eagle's medium (DMEM) plus 10% fetal bovine serum (FBS). The human breast
carcinoma line MCF-7 was grown in DMEM + 10% FBS containing 60 ng/ml of insulin, and human prostatic
adenocarcinoma cells (PC-3) were grown in a mixture of DMEM and Ham's F-12 medium (1:1) containing 10%
FBS. Several routine and general methodological procedures were utilized and are described in the articles cited
herein, all of which are incorporated by reference.

Confluent dishes of A549 cells were split 1:10. Twenty hours later, they were treated with 20 ng/ml
recombinant TGF-β1 in complete medium for 72 hours. This resulted in an 80-90% inhibition of DNA synthesis. A549
cells which were not treated with TGF-β1 were used as controls. Poly(A)-containing RNA was extracted and
a cDNA library was constructed in λgt10 by the method described in Webb et al. (1987) DNA 6:71-78, which
is incorporated herein by reference. Duplicate filters were screened with [32P]-labeled cDNA from treated and
untreated cells. Plaques showing increased hybridization to the treated probe were purified through the tertiary
stage and the cDNA inserts were subcloned into pEMBL, as described in Dente et al. (1983) Nucleic Acids
Res. 11:1645-1654. Several clones were isolated and one clone, pβig-h3a, was chosen for further study.

Northern blot analysis using pβig-h3a as a probe detects a major transcript of 3.4 kb which is induced about
10-fold in A549 cells after 72 hours of treatment with TGF-β1 (FIGURE 1A). A longer exposure of FIGURE 1A
demonstrates that the βig-h3 transcript can be detected at low levels in untreated cells (FIGURE 1C). βig-h3 is
also induced by TGF-β2, as shown in FIGURE 1D, and thus appears to be a TGF-β induced gene. A time course of
induction is presented in FIGURE 2 and indicates that maximal stimulation of βig-h3 by TGF-β1 in A549 cells
occurred after 48 hours of TGF-β1 treatment (a 20-fold increase above untreated cells).
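Fold induction of the kind reported above is a ratio of band intensities. The following is a minimal sketch of that calculation; the function name, the optional background correction, and the counts used in the example are all hypothetical and are not data from this specification.

```python
def fold_induction(treated: float, untreated: float, background: float = 0.0) -> float:
    """Ratio of treated to untreated band signal after background subtraction.

    All arguments are band intensities in the same arbitrary units
    (for example, integrated counts from an imager).
    """
    corrected_treated = treated - background
    corrected_untreated = untreated - background
    if corrected_untreated <= 0:
        raise ValueError("untreated signal must exceed background")
    return corrected_treated / corrected_untreated


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = fold_induction(treated=24000.0, untreated=1500.0, background=300.0)
print(f"{example:.2f}-fold induction")
```

Subtracting background before taking the ratio matters most when the untreated band is faint, as it is for a transcript detected only at low levels without treatment.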

Noticeable morphological changes of A549 cells occur upon TGF-β treatment. The cells appear larger,
more spread out and assume a flattened morphology. These phenotypic changes are reversed upon removal
of TGF-β and regrowth of the cells in complete medium.

Removal of TGF-β1 from the culture medium resulted in a decrease in the expression of βig-h3 to the levels
found in untreated cells (FIGURE 3). This finding is consistent with the reversible growth inhibition of those
cells.

Total RNA was extracted from both untreated cells and from cells treated with TGF-β, as described above.
The RNA was fractionated on a 1% agarose-formaldehyde gel, according to the method of Lehrach et al.
(1977) Biochemistry 16:4743-4751, transferred to a nylon membrane (Hybond N, Amersham) and hybridized
to [32P]-labeled probe, according to the method described in Madisen et al. (1988) DNA 7:1-8. The bands were
quantitated using a Molecular Dynamics Phosphoimager.

The increase in βig-h3 RNA could be due to either an increase in transcription or an increase in half-life.
The half-life of the βig-h3 transcripts was determined in untreated and TGF-β1 treated A549 cells. The results
shown in FIGURE 4 illustrate that the half-life of βig-h3 RNA in untreated cells was about 5 hours, and was only
slightly increased, to 7 hours, in TGF-β1 treated, transcriptionally inhibited (actinomycin D-treated) cells. The
major increase in βig-h3 RNA thus appears to be due to an increase in transcription, rather than an increase
in half-life. As shown in FIGURE 2, the kinetics of βig-h3 message accumulation imply a half-life of 7-11 hours,
which is the same range observed in the actinomycin D studies. This suggests that message stability is not
grossly altered by actinomycin D in these studies.
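A half-life can be estimated from percent-remaining measurements by assuming first-order decay and fitting a straight line to the logarithm of the signal. The sketch below is an illustration under that assumption, not the procedure used in the Examples; the function name is hypothetical and the data are synthetic, generated from a 5 hour half-life rather than read from FIGURE 4.

```python
import math


def fit_half_life(times_h, percent_remaining):
    """Estimate half-life (hours) assuming first-order decay.

    Fits ln(percent) = ln(P0) - k * t by ordinary least squares and
    returns ln(2) / k.
    """
    logs = [math.log(p) for p in percent_remaining]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("signal does not decay over the time course")
    return math.log(2) / -slope


# Synthetic time course generated with a 5 hour half-life (not measured data).
times = [0, 2, 4, 8, 12]
synthetic = [100.0 * 0.5 ** (t / 5.0) for t in times]
print(f"estimated half-life: {fit_half_life(times, synthetic):.2f} h")
```

Because the fit is done on a log scale, noisy points near the end of the time course, where little signal remains, can dominate the estimate; weighting or dropping them is a judgment call this sketch leaves out.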
Several normal and cancerous human cell lines were examined for induction of βig-h3. TGF-β1 treatment of
HEPM (human embryonic palatal mesenchymal) cells and H2981 cells resulted in an increase in βig-h3 mRNA.
βig-h3 message was not induced by TGF-β1 in 293 cells nor in the breast cancer cell lines MCF-7, MDA453
or MDA468. The fact that βig-h3 is not induced in all cell types is not a unique finding, as the induction of other
genes by TGF-β is known to vary among cell lines. For example, c-myc is reported to be stimulated
in AKR-2B fibroblasts (Leof et al. (1986) Proc. Natl. Acad. Sci. USA 83:1453-1458), but down regulated in
keratinocytes (Coffey et al. (1988) Cancer Res. 48:1596-1602).

EXAMPLE 2 

Sequence Analysis

DNA sequence analysis was performed by the method of Sanger et al. (1977) Proc. Natl. Acad. Sci. USA
74:5463-5467.

Nucleotide sequence analysis of pβig-h3a revealed that it contained a partial open reading frame. The
cDNA library was therefore rescreened with [32P]-labeled βig-h3a probe until several overlapping clones
encoding the entire open reading frame were obtained. The nucleotide and deduced amino acid sequence of
βIG-H3 is shown in FIGURE 5 and is described in Sequence I.D. Numbers 1 and 2. The cDNA contains a single
open reading frame encoding a 683 amino acid protein, βIG-H3. βIG-H3 contains an amino terminal signal
peptide and an RGD sequence located near the carboxy terminus (residues 642-644). This motif has previously been
shown to serve as a ligand recognition sequence for several integrins (Ruoslahti, E. (1989) J. Biol. Chem.
264:13369-13371). There are no predicted sites of N-linked glycosylation. A polyadenylation signal is present
at nucleotide residue 2624.
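Deducing an amino acid sequence from a cDNA amounts to locating a start codon and reading triplets in frame. The following sketch translates the first 120 nucleotides of SEQ ID NO:1 as they appear in the sequence listing of this document. Treating the first ATG as the start codon is a simplifying assumption, the function name is hypothetical, and the standard genetic code is used.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_from_first_atg(dna: str) -> tuple[int, str]:
    """Return (1-based position of the first ATG, one-letter translation).

    Translation stops at the first stop codon or when fewer than three
    nucleotides remain. Whitespace in the input is ignored.
    """
    dna = "".join(dna.split()).upper()
    start = dna.find("ATG")
    if start < 0:
        raise ValueError("no ATG found")
    residues = []
    for i in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        residues.append(residue)
    return start + 1, "".join(residues)


# Nucleotides 1-120 of SEQ ID NO:1, copied from the sequence listing.
FIRST_120_NT = """
GCTTGCCCGT CGGTCGCTAG CTCGCTCGGT GCGCGTCGTC CCGCTCCATG GCGCTCTTCG
TCCGGCTGCT GGCTCTCGCC CTGGCTCTGG CCCTGGGCCC CGCCGCGACC CTGGCGGGTC
"""

position, peptide = translate_from_first_atg(FIRST_120_NT)
print(position, peptide)
```

The leucine- and alanine-rich stretch at the start of the translated peptide is the kind of hydrophobic run expected of an amino terminal signal peptide.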

A TFASTA search of the GenBank and EMBL databases with the βig-h3 open reading frame indicated that
the protein was unique. Short regions with homology to grasshopper and Drosophila fasciclin-I and Mpb70 from
Mycobacterium bovis were identified. FIGURE 6A shows multiple alignments of regions from these proteins.

Upon dot matrix analysis of βIG-H3, four homologous domains of approximately 140 amino acids were
revealed. A comparison of these repeats is shown in FIGURE 6B and illustrates interdomain homologies ranging
from 31% (between domains 2 and 4) to 16% (between domains 1 and 3), with domain 3 the most divergent.
These interdomain homologies are similar to those found in fasciclin-I, wherein repeat 2 appears to be the most
divergent. The domains of βIG-H3 and fasciclin-I share 3 highly conserved amino acid stretches. One stretch
contains 9 of 10 amino acids conserved at the amino end (TXFAPSNEAW). A second stretch has 6 of 8
amino acids conserved about 30 residues from the amino end (R X 1 1 N X H I); and a third region near the
carboxy end has 12 of 16 amino acids conserved (ATNGVVHXIDXVtXXP). These comparisons are
illustrated in FIGURE 6A.
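Two ideas in the comparison above lend themselves to a short sketch: matching a consensus in which X stands for any residue, and scoring percent identity between two aligned stretches. The helper names below are hypothetical, and the peptide strings are short examples chosen for illustration rather than the alignments of FIGURE 6. Percent identity here is computed over aligned columns without gaps, which is one of several conventions.

```python
import re


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Compile a consensus such as 'TXFAPSNEAW', where X matches any residue."""
    parts = ["." if ch == "X" else re.escape(ch) for ch in consensus.upper()]
    return re.compile("".join(parts))


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over aligned columns that contain no gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


pattern = consensus_to_regex("TXFAPSNEAW")
hit = pattern.search("GSFTIFAPSNEAWASL")       # illustrative peptide
print(hit.start() + 1 if hit else None)        # 1-based start of the match
print(percent_identity("TIFAPSNEAW", "TLLAPTNEAF"))
```

A strict pattern like this only finds stretches that match every fixed position; a "9 of 10 conserved" description of a family of repeats is better captured by the identity score than by an exact match.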

Mpb70 is the major secreted protein from Mycobacterium bovis, the causal agent of bovine tuberculosis.
Mpb70 occurs as a dimer of a 163 amino acid monomer with 33% homology to the βIG-H3 domains in the
carboxy terminal 97 amino acids. The amino terminal 66 amino acids carry mycobacterium specific epitopes
(Redford et al. (1990) J. of Gen. Microbiol. 136:265-272).

The foregoing description and Examples are intended as illustrative of the present invention, but not as
limiting. Numerous variations and modifications may be effected without departing from the true spirit and
scope of the present invention.
SEQUENCE LISTING
GENERAL INFORMATION 

(i) APPLICANT: 

(A) NAME: BRISTOL-MYERS SQUIBB COMPANY 

(B) STREET: 345 PARK AVENUE

(C) CITY: NEW YORK 

(D) STATE: NEW YORK 

(E) COUNTRY: USA 

(F) POSTAL CODE: 10154 

(ii) TITLE OF INVENTION: TGF-BETA INDUCED GENE AND 
PROTEIN 

(iii) NUMBER OF SEQUENCES: 2 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(v) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 
INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2691 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(F) TISSUE TYPE: LUNG 

(G) CELL TYPE: ADENOCARCINOMA 

(H) CELL LINE: A549 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GCTTGCCCGT CGGTCGCTAG CTCGCTCGGT GCGCGTCGTC CCGCTCCATG GCGCTCTTCG 60 

TCCGGCTGCT GGCTCTCGCC CTGGCTCTGG CCCTGGGCCC CGCCGCGACC CTGGCGGGTC 120 

CCGCCAAGTC GCCCTACCAG CTGGTGCTGC AGCACAGCAG GCTCCGGGGC CGCCAGCACG 180 

GCCCCAACGT GTGTGCTGTG CAGAAGGTTA TTGCCACTAA TAGGAAGTAC TTCACCAACT 240 

GCAAGCAGTG GTACCAAAGG AAAATCTGTG GCAAATCAAC ACTCATCAGC TACGAGTGCT 300 

GTCCTGGATA TGAAAAGCTC CCTGGGGAGA AGGGCTGTCC AGCAGCCCTA CCACTCTCAA 360 

ACCTTTACGA GACCCTGGGA GTCGTTGGAT CCACCACCAC TCAGCTGTAC ACGGACCGCA 420 

CGGAGAAGCT GAGGCCTGAG ATGGAGGGGC CCGGCAGCTT CACCATCTTC GCCCCTAGCA 480 

ACGAGGCCTG GGCCTCCTTG CCAGCTGAAG TGCTGGACTC CCTGGTCAGC AATGTCAACA 540 

TTGAGCTGCT CAATGCCCTC CGCTACCATA TGGTGGGCAG GCGAGTCCTG ACTGATGAGC 600 

TGAAACACGG CATGACCCTC ACCTCTATGT ACCAGAATTC CAACATCCAG ATCCACCACT 660 

ATCCTAATGG GATTGTAACT GTGAACTGTG CCCGGCTCCT GAAAGCCGAC CACCATGCAA 720 

CCAACGGGGT GGTGCACCTC ATCGATAAGG TCATCTCCAC CATCACCAAC AACATCCAGC 780 

AGATCATTGA GATCGAGGAC ACCTTTGAGA CCCTTCGGGC TGCTGTGGCT GCATCAGGGC 840 

TCAACACGAT GCTTGAAGGT AACGGCCAGT ACACGCTTTT GGCCCCGACC AATGAGGCCT 900 

TCGAGAAGAT CCCTAGTGAG ACTTTGAACC GTATCCTGOG CGACCCAGAA GCCCTGACAG 960 

ACCTGCTGAA CAACCACATC TTGAAGTCAG CTATGTGTGC TGAAGCCATC GTTGCGGGGC 1020 

TGTCTGTAGA GACCCTGGAG GGCACGACAC TGGAGGTGGG CTGCAGCGGG GACATGCTCA 1080

CTATCAACGG GAAGGCGATC ATCTCCAATA AAGACATCCT AGCCACCAAC GGGGTGATCC 1140 

ACTACATTGA TGAGCTACTC ATCCCAGACT CAGCCAAGAC ACTATTTGAA TTGGCTGCAG 1200 

AGTCTGATGT GTCCACAGCC ATTGACCTTT TCAGACAAGC CCTCCCTCGGC AATCATCTCT 1260 

CTGGAAGTGA GCGGTTGACC CTCCTGGCTC CCCTGAATTC TGTATTCAAA GATGGAACCC 1320 

CTCCAATTGA TGCCCATACA AGGAATTTGC TTCGGAACCA CATAATTAAA GACCAGCTGG 1380

CCTCTAAGTA TCTGTACCAT GGACAGACCC TGGAAACTCT GGGCGGCAAA AAACTGAGAG 1440 

TTTTTGTTTA TCGTAATAGC CTCTGCATTG AGAACAGCTG CATOGCGGCC CACGACAAGA 1500 

GGGGGAGGTA CGGGACCCTG TTCACGATGG ACCGGGTGCT GACCCCCCCA ATGGGGACTG 1560 

TCATGCATGT CCTGAAGGGA GACAATCGCT TTAGCATGCT GGTAGCTGCC ATCCAGTCTG 1620 

CAGGACTGAC GGAGACCCTC AACCGGGAAG GACTCTACAC AGTCTTTGCT CCCACAAATG 1680 

AAGCCTTCCG AGCCCTGCCA CCAAGAGAAC GGAGCAGACT CTTGGGAGAT GCCAAGGAAC 1740 

TTGCCAACAT CCTGAAATAC CACATTGGTG ATGAAATCCT GGTTAGCGGA GGCATCGGGG 1800 

CCCTGGTGCG GCTAAAGTCT CTCCAAGGTG ACAAGCTGGA AGTCAGCTTG AAAAACAATG 1860 

TGGTGAGTGT CAACAAGGAG CCTGTTGCCG AGCCTGACAT CATGGCCACA AATGGCGTGG 1920 

TCCATGTCAT CACCAATGTT CTGCAGCCTC CAGCCAACAG ACCTCAGGAA AGAGGGGATG 1980 
AACTTCCACA CTCTGCGCTT GAGATCTTCA AACAAGCATC AGCGTTTTCC AGGGCTTCCC 2040 

AGAGGTCTGT GCGACTAGCC CCTGTCTATC AAAAGTTATT AGAGAGGATG AAGCATTAGC 2100 

TTGAAGCACT ACAGGAGGAA TGCACCACGG CAGCTCTCCG CCAATTTCTC TCAGATTTCC 2160 

ACAGAGACTG TTTGAATGTT TTCAAAACCA AGTATCACAC TTTAATGTAC ATGGGCCGCA 2220 

CCATAATGAG ATGTGAGCCT TGTGCATGTG GGGGAGGAGG GAGAGAGATG TACTTTTTAA 2280 

ATCATGTTCC CCCTAAACAT GGCTGTTAAC CCACTGCATG CAGAAACTTG GATGTCACTG 2340 

CCTGACATTC ACTTCCAGAG AGGACCTATC CCAAATGTGG AATTGACTGC CTATGCCAAG 2400 

TCCCTGGAAA AGGAGCTTCA GTATTGTGGG GCTCATAAAA CATGAATCAA GCAATCCAGC 2460 

CTCATGGGAA GTCCTGGCAC AGTTTTTGTA AAGCCCTTGC ACAGCTGGAG AAATGGCATC 2520 

ATTATAAGCT ATGAGTTGAA ATGTTCTGTC AAATGTGTCT CACATCTACA CGTGGCTTGG 2580 

AGGCTTTTAT GGGGCCCTGT CCAGGTAGAA AAGAAATGGT ATGTAGAGCT TAGATTTCCC 2640 

TATTGTGACA GAGCCATGGT GTGTTTGTAA TAATAAAACC AAAGAAACAT A 2691 
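
The relationship between SEQ ID NO:1 and the protein of SEQ ID NO:2 can be checked with a short script. The following is a minimal sketch in Python, not part of the original disclosure: the function names are hypothetical, it assumes the standard genetic code, and it assumes the coding region begins at the first ATG in the listing. It cleans one printed line of the listing and translates the complete codons that follow the start codon.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean_listing(text):
    """Keep only A, C, G and T; drops spaces and the trailing position number."""
    return "".join(ch for ch in text.upper() if ch in "ACGT")


def translate_from_first_atg(dna):
    """Translate complete codons from the first ATG until a stop codon.

    Returns (zero-based index of the ATG, one-letter protein string).
    Assumption: the first ATG is the start codon.
    """
    start = dna.find("ATG")
    if start < 0:
        return start, ""
    protein = []
    for pos in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return start, "".join(protein)


# First printed line of SEQ ID NO:1 (bases 1 to 60), copied from the listing.
first_line = "GCTTGCCCGT CGGTCGCTAG CTCGCTCGGT GCGCGTCGTC CCGCTCCATG GCGCTCTTCG 60"
start, peptide = translate_from_first_atg(clean_listing(first_line))
print(start + 1, peptide)
```

On this one line the start codon falls at base 48, which agrees with the 47 bases of 5' untranslated sequence numbered -47 to -1 in FIGURE 5. Applying the same functions to the whole listing would require the complete, verified sequence.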

(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 683 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: YES 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Homo sapiens 
(F) TISSUE TYPE: LUNG 
(G) CELL TYPE: ADENOCARCINOMA 
(H) CELL LINE: A549 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Ala Leu Phe Val Arg Leu Leu Ala Leu Ala Leu Ala Leu Ala Leu 
1 5 10 15 

Gly Pro Ala Ala Thr Leu Ala Gly Pro Ala Lys Ser Pro Tyr Gln Leu 
20 25 30 

Val Leu Gln His Ser Arg Leu Arg Gly Arg Gln His Gly Pro Asn Val 
35 40 45 

Cys Ala Val Gln Lys Val Ile Gly Thr Asn Arg Lys Tyr Phe Thr Asn 
50 55 60 

Cys Lys Gln Trp Tyr Gln Arg Lys Ile Cys Gly Lys Ser Thr Val Ile 
65 70 75 80 

Ser Tyr Glu Cys Cys Pro Gly Tyr Glu Lys Val Pro Gly Glu Lys Gly 
85 90 95 

Cys Pro Ala Ala Leu Pro Leu Ser Asn Leu Tyr Glu Thr Leu Gly Val 
100 105 110 

Val Gly Ser Thr Thr Thr Gln Leu Tyr Thr Asp Arg Thr Glu Lys Leu 
115 120 125 

Arg Pro Glu Met Glu Gly Pro Gly Ser Phe Thr Ile Phe Ala Pro Ser 
130 135 140 

Asn Glu Ala Trp Ala Ser Leu Pro Ala Glu Val Leu Asp Ser Leu Val 
145 150 155 160 

Ser Asn Val Asn Ile Glu Leu Leu Asn Ala Leu Arg Tyr His Met Val 
165 170 175 

Gly Arg Arg Val Leu Thr Asp Glu Leu Lys His Gly Met Thr Leu Thr 
180 185 190 

Ser Met Tyr Gln Asn Ser Asn Ile Gln Ile His His Tyr Pro Asn Gly 
195 200 205 

Ile Val Thr Val Asn Cys Ala Arg Leu Leu Lys Ala Asp His His Ala 
210 215 220 

Thr Asn Gly Val Val His Leu Ile Asp Lys Val Ile Ser Thr Ile Thr 
225 230 235 240 

Asn Asn Ile Gln Gln Ile Ile Glu Ile Glu Asp Thr Phe Glu Thr Leu 
245 250 255 

Arg Ala Ala Val Ala Ala Ser Gly Leu Asn Thr Met Leu Glu Gly Asn 
260 265 270 

Gly Gln Tyr Thr Leu Leu Ala Pro Thr Asn Glu Ala Phe Glu Lys Ile 
275 280 285 

Pro Ser Glu Thr Leu Asn Arg Ile Leu Gly Asp Pro Glu Ala Leu Arg 
290 295 300 

Asp Leu Leu Asn Asn His Ile Leu Lys Ser Ala Met Cys Ala Glu Ala 
305 310 315 320 

Ile Val Ala Gly Leu Ser Val Glu Thr Leu Glu Gly Thr Thr Leu Glu 
325 330 335 

Val Gly Cys Ser Gly Asp Met Leu Thr Ile Asn Gly Lys Ala Ile Ile 
340 345 350 

Ser Asn Lys Asp Ile Leu Ala Thr Asn Gly Val Ile His Tyr Ile Asp 
355 360 365 

Glu Leu Leu Ile Pro Asp Ser Ala Lys Thr Leu Phe Glu Leu Ala Ala 
370 375 380 

Glu Ser Asp Val Ser Thr Ala Ile Asp Leu Phe Arg Gln Ala Gly Leu 
385 390 395 400 

Gly Asn His Leu Ser Gly Ser Glu Arg Leu Thr Leu Leu Ala Pro Leu 
405 410 415 

Asn Ser Val Phe Lys Asp Gly Thr Pro Pro Ile Asp Ala His Thr Arg 
420 425 430 

Asn Leu Leu Arg Asn His Ile Ile Lys Asp Gln Leu Ala Ser Lys Tyr 
435 440 445 

Leu Tyr His Gly Gln Thr Leu Glu Thr Leu Gly Gly Lys Lys Leu Arg 
450 455 460 
Val Phe Val Tyr Arg Asn Ser Leu Cys Ile Glu Asn Ser Cys Ile Ala 
465 470 475 480 

Ala His Asp Lys Arg Gly Arg Tyr Gly Thr Leu Phe Thr Met Asp Arg 
485 490 495 

Val Leu Thr Pro Pro Met Gly Thr Val Met Asp Val Leu Lys Gly Asp 
500 505 510 

Asn Arg Phe Ser Met Leu Val Ala Ala Ile Gln Ser Ala Gly Leu Thr 
515 520 525 

Glu Thr Leu Asn Arg Glu Gly Val Tyr Thr Val Phe Ala Pro Thr Asn 
530 535 540 

Glu Ala Phe Arg Ala Leu Pro Pro Arg Glu Arg Ser Arg Leu Leu Gly 
545 550 555 560 

Asp Ala Lys Glu Leu Ala Asn Ile Leu Lys Tyr His Ile Gly Asp Glu 
565 570 575 

Ile Leu Val Ser Gly Gly Ile Gly Ala Leu Val Arg Leu Lys Ser Leu 
580 585 590 

Gln Gly Asp Lys Leu Glu Val Ser Leu Lys Asn Asn Val Val Ser Val 
595 600 605 

Asn Lys Glu Pro Val Ala Glu Pro Asp Ile Met Ala Thr Asn Gly Val 
610 615 620 

Val His Val Ile Thr Asn Val Leu Gln Pro Pro Ala Asn Arg Pro Gln 
625 630 635 640 

Glu Arg Gly Asp Glu Leu Ala Asp Ser Ala Leu Glu Ile Phe Lys Gln 
645 650 655 

Ala Ser Ala Phe Ser Arg Ala Ser Gln Arg Ser Val Arg Leu Ala Pro 
660 665 670 

Val Tyr Gln Lys Leu Leu Glu Arg Met Lys His 
675 680 
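
The position of the Arg-Gly-Asp sequence near the carboxy terminus can be confirmed directly from the listing. The sketch below is illustrative only and is not part of the original disclosure: the helper names are hypothetical, and the input is residues 625 to 656 of SEQ ID NO:2 written with standard three-letter codes. It converts the residues to one-letter codes and reports the 1-based positions of the motif.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_text):
    """Convert whitespace-separated three-letter codes to a one-letter string."""
    return "".join(THREE_TO_ONE[token] for token in three_letter_text.split())


def find_motif(segment, motif, first_residue_number):
    """Return (start, end) residue numbers, 1-based inclusive, for each hit."""
    hits = []
    index = segment.find(motif)
    while index != -1:
        start = first_residue_number + index
        hits.append((start, start + len(motif) - 1))
        index = segment.find(motif, index + 1)
    return hits


# Residues 625-656 of SEQ ID NO:2 (two rows of the listing).
c_terminal = (
    "Val His Val Ile Thr Asn Val Leu Gln Pro Pro Ala Asn Arg Pro Gln "
    "Glu Arg Gly Asp Glu Leu Ala Asp Ser Ala Leu Glu Ile Phe Lys Gln"
)
segment = to_one_letter(c_terminal)
rgd_hits = find_motif(segment, "RGD", first_residue_number=625)
print(rgd_hits)
```

For this segment the motif is found once, at residues 642 to 644, which matches the residue numbers recited in Claim 7.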



Claims 

1. A substantially pure protein comprising a protein having a sequence of about 683 amino acid residues in 
length and substantially corresponding to Sequence I.D. 2, wherein said protein is induced by contacting 
mammalian cells with transforming growth factor beta to growth arrest said cells. 

2. The protein according to Claim 1, wherein said transforming growth factor beta is selected from the group 
consisting of TGF-β1, TGF-β2, TGF-β3, a TGF-β1/β2 hybrid molecule and fragments thereof. 

3. The protein according to Claim 1, wherein said protein is βIG-H3. 

4. The protein according to Claim 1, wherein said protein contains four homologous repeating regions. 

5. The protein according to Claim 1, wherein said mammalian cells are human cells. 

6. The protein according to Claim 1, wherein said human cells are selected from the group consisting of lung 
adenocarcinoma cells, embryonic palatal mesenchymal cells and prostatic adenocarcinoma cells. 

7. βIG-H3, a substantially pure protein comprising an amino acid residue sequence of about 683 amino acid 
residues substantially corresponding to Sequence I.D. 2 and FIGURE 5, wherein said protein contains an 
Arg-Gly-Asp sequence in the carboxy terminal amino acids corresponding to amino acid residues 642-644 
in FIGURE 5. 

8. βIG-H3 according to Claim 7, wherein said protein contains four homologous repeating regions as depicted 
in FIGURE 6. 

9. βIG-H3 according to Claim 8, wherein said repeating regions have a homology of at least 16% with each 
other. 

10. A substantially pure nucleotide sequence encoding a gene whose expression is induced by contacting 
mammalian cells with transforming growth factor beta, comprising a nucleotide sequence substantially 
corresponding to Sequence I.D. 1 and FIGURE 5. 

11. The nucleotide sequence according to Claim 10, wherein said transforming growth factor beta induces 
the production of a 3.4 kilobase RNA transcript from said gene. 

12. The nucleotide sequence according to Claim 10, wherein said transforming growth factor beta is selected 
from the group consisting of TGF-β1, TGF-β2, TGF-β3, a TGF-β1/β2 hybrid molecule and fragments thereof. 

13. The nucleotide sequence according to Claim 10, wherein said gene encodes the expression of βIG-H3. 

14. A process for the production of a protein according to any one of Claims 1 to 9, comprising the steps of: 

i) inserting the nucleotide sequence of any one of Claims 10 to 13 into an expression system; 

ii) inducing the expression system to express the nucleotide sequence to form a protein product; and 

iii) isolating the protein product. 

15. A process for identifying a protein whose expression is induced by TGF-β comprising the steps of: 

i) growing a cell in the presence of TGF-β; 

ii) constructing a cDNA library from the cell; 

iii) comparing the cDNA library with another cDNA library constructed from a cell grown in the absence 
of TGF-β and identifying the TGF-β-specific clones; and 

iv) further characterising the TGF-β-specific clones to identify the proteins thereby encoded. 
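
The percentage homology between repeating regions recited in the claims can be illustrated with a simple identity calculation. This is a sketch under stated assumptions, not the method used to prepare FIGURE 6: the function name is hypothetical, the two segments are only the short conserved cores following the REPEAT 1 and REPEAT 2 boundaries in FIGURE 5 (residues 139-148 and 276-285), and they are compared position by position without gaps.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two equal-length segments.

    Assumption: the segments are already aligned; '-' marks a gap and
    never counts as a match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned segments must have the same length")
    if not seq_a:
        raise ValueError("segments must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return 100.0 * matches / len(seq_a)


# Thr Ile Phe Ala Pro Ser Asn Glu Ala Trp (residues 139-148)
repeat1_core = "TIFAPSNEAW"
# Thr Leu Leu Ala Pro Thr Asn Glu Ala Phe (residues 276-285)
repeat2_core = "TLLAPTNEAF"

identity = percent_identity(repeat1_core, repeat2_core)
print(identity)
```

Because only the most conserved stretch is compared here, the value is well above the 16% figure, which refers to the full repeating regions as aligned in FIGURE 6.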
Figure 2 
-47 GCTTGCCCGTCGGTCGCTAGCTCGCTCGGTGCGCGTCGTCCCGCTCC -1 

Met Ala Leu Phe Val Arg Leu Leu Ala Leu Ala Leu Ala Leu Ala Leu 
ATG GCG CTC TTC GTG CGG CTG CTG GCT CTC GCC CTG GCT CTG GCC CTG 48 

20 
Gly Pro Ala Ala Thr Leu Ala Gly Pro Ala Lys Ser Pro Tyr Gln Leu 
GGC CCC GCC GCG ACC CTG GCG GGT CCC GCC AAG TCG CCC TAC CAG CTG 96 

35 45 
Val Leu Gln His Ser Arg Leu Arg Gly Arg Gln His Gly Pro Asn Val 
GTG CTG CAG CAC AGC AGG CTC CGG GGC CGC CAG CAC GGC CCC AAC GTG 144 

60 
Cys Ala Val Gln Lys Val Ile Gly Thr Asn Arg Lys Tyr Phe Thr Asn 
TGT GCT GTG CAG AAG GTT ATT GGC ACT AAT AGG AAG TAC TTC ACC AAC 192 

70 
Cys Lys Gln Trp Tyr Gln Arg Lys Ile Cys Gly Lys Ser Thr Val Ile 
TGC AAG CAG TGG TAC CAA AGG AAA ATC TGT GGC AAA TCA ACA GTC ATC 240 

85 95 
Ser Tyr Glu Cys Cys Pro Gly Tyr Glu Lys Val Pro Gly Glu Lys Gly 
AGC TAC GAG TGC TGT CCT GGA TAT GAA AAG GTC CCT GGG GAG AAG GGC 288 

110 
Cys Pro Ala Ala Leu Pro Leu Ser Asn Leu Tyr Glu Thr Leu Gly Val 
TGT CCA GCA GCC CTA CCA CTC TCA AAC CTT TAC GAG ACC CTG GGA GTC 336 

120 
Val Gly Ser Thr Thr Thr Gln Leu Tyr Thr Asp Arg Thr Glu Lys Leu 
GTT GGA TCC ACC ACC ACT CAG CTG TAC ACG GAC CGC ACG GAG AAG CTG 384 

135 REPEAT 1 
Arg Pro Glu Met Glu Gly Pro Gly Ser Phe | Thr Ile Phe Ala Pro Ser 
AGG CCT GAG ATG GAG GGG CCC GGC AGC TTC ACC ATC TTC GCC CCT AGC 432 

145 160 
Asn Glu Ala Trp Ala Ser Leu Pro Ala Glu Val Leu Asp Ser Leu Val 
AAC GAG GCC TGG GCC TCC TTG CCA GCT GAA GTG CTG GAC TCC CTG GTC 480 

170 
Ser Asn Val Asn Ile Glu Leu Leu Asn Ala Leu Arg Tyr His Met Val 
AGC AAT GTC AAC ATT GAG CTG CTC AAT GCC CTC CGC TAC CAT ATG GTG 528 

185 
Gly Arg Arg Val Leu Thr Asp Glu Leu Lys His Gly Met Thr Leu Thr 
GGC AGG CGA GTC CTG ACT GAT GAG CTG AAA CAC GGC ATG ACC CTC ACC 576 

195 
Ser Met Tyr Gln Asn Ser Asn Ile Gln Ile His His Tyr Pro Asn Gly 
TCT ATG TAC CAG AAT TCC AAC ATC CAG ATC CAC CAC TAT CCT AAT GGG 624 

Figure 5 (i) 
210 220 
Ile Val Thr Val Asn Cys Ala Arg Leu Leu Lys Ala Asp His His Ala 
ATT GTA ACT GTG AAC TGT GCC CGG CTC CTG AAA GCC GAC CAC CAT GCA 672 

235 
Thr Asn Gly Val Val His Leu Ile Asp Lys Val Ile Ser Thr Ile Thr 
ACC AAC GGG GTG GTG CAC CTC ATC GAT AAG GTC ATC TCC ACC ATC ACC 720 

245 
Asn Asn Ile Gln Gln Ile Ile Glu Ile Glu Asp Thr Phe Glu Thr Leu 
AAC AAC ATC CAG CAG ATC ATT GAG ATC GAG GAC ACC TTT GAG ACC CTT 768 

260 270 
Arg Ala Ala Val Ala Ala Ser Gly Leu Asn Thr Met Leu Glu Gly Asn 
CGG GCT GCT GTG GCT GCA TCA GGG CTC AAC ACG ATG CTT GAA GGT AAC 816 

REPEAT 2 285 
Gly Gln | Tyr Thr Leu Leu Ala Pro Thr Asn Glu Ala Phe Glu Lys Ile 
GGC CAG TAC ACG CTT TTG GCC CCG ACC AAT GAG GCC TTC GAG AAG ATC 864 

295 
Pro Ser Glu Thr Leu Asn Arg Ile Leu Gly Asp Pro Glu Ala Leu Arg 
CCT AGT GAG ACT TTG AAC CGT ATC CTG GGC GAC CCA GAA GCC CTG AGA 912 

310 320 
Asp Leu Leu Asn Asn His Ile Leu Lys Ser Ala Met Cys Ala Glu Ala 
GAC CTG CTG AAC AAC CAC ATC TTG AAG TCA GCT ATG TGT GCT GAA GCC 960 

335 
Ile Val Ala Gly Leu Ser Val Glu Thr Leu Glu Gly Thr Thr Leu Glu 
ATC GTT GCG GGG CTG TCT GTA GAG ACC CTG GAG GGC ACG ACA CTG GAG 1008 

345 
Val Gly Cys Ser Gly Asp Met Leu Thr Ile Asn Gly Lys Ala Ile Ile 
GTG GGC TGC AGC GGG GAC ATG CTC ACT ATC AAC GGG AAG GCG ATC ATC 1056 

360 
Ser Asn Lys Asp Ile Leu Ala Thr Asn Gly Val Ile His Tyr Ile Asp 
TCC AAT AAA GAC ATC CTA GCC ACC AAC GGG GTG ATC CAC TAC ATT GAT 1104 

370 
Glu Leu Leu Ile Pro Asp Ser Ala Lys Thr Leu Phe Glu Leu Ala Ala 
GAG CTA CTC ATC CCA GAC TCA GCC AAG ACA CTA TTT GAA TTG GCT GCA 1152 

385 395 
Glu Ser Asp Val Ser Thr Ala Ile Asp Leu Phe Arg Gln Ala Gly Leu 
GAG TCT GAT GTG TCC ACA GCC ATT GAC CTT TTC AGA CAA GCC GGC CTC 1200 

410 REPEAT 3 
Gly Asn His Leu Ser Gly Ser Glu Arg Leu | Thr Leu Leu Ala Pro Leu 
GGC AAT CAT CTC TCT GGA AGT GAG CGG TTG ACC CTC CTG GCT CCC CTG 1248 

Figure 5 (ii) 
420 
Asn Ser Val Phe Lys Asp Gly Thr Pro Pro Ile Asp Ala His Thr Arg 
AAT TCT GTA TTC AAA GAT GGA ACC CCT CCA ATT GAT GCC CAT ACA AGG 1296 

435 445 
Asn Leu Leu Arg Asn His Ile Ile Lys Asp Gln Leu Ala Ser Lys Tyr 
AAT TTG CTT CGG AAC CAC ATA ATT AAA GAC CAG CTG GCC TCT AAG TAT 1344 

460 
Leu Tyr His Gly Gln Thr Leu Glu Thr Leu Gly Gly Lys Lys Leu Arg 
CTG TAC CAT GGA CAG ACC CTG GAA ACT CTG GGC GGC AAA AAA CTG AGA 1392 

470 
Val Phe Val Tyr Arg Asn Ser Leu Cys Ile Glu Asn Ser Cys Ile Ala 
GTT TTT GTT TAT CGT AAT AGC CTC TGC ATT GAG AAC AGC TGC ATC GCG 1440 

485 495 
Ala His Asp Lys Arg Gly Arg Tyr Gly Thr Leu Phe Thr Met Asp Arg 
GCC CAC GAC AAG AGG GGG AGG TAC GGG ACC CTG TTC ACG ATG GAC CGG 1488 

510 
Val Leu Thr Pro Pro Met Gly Thr Val Met Asp Val Leu Lys Gly Asp 
GTG CTG ACC CCC CCA ATG GGG ACT GTC ATG GAT GTC CTG AAG GGA GAC 1536 

520 
Asn Arg Phe Ser Met Leu Val Ala Ala Ile Gln Ser Ala Gly Leu Thr 
AAT CGC TTT AGC ATG CTG GTA GCT GCC ATC CAG TCT GCA GGA CTG ACG 1584 

535 REPEAT 4 
Glu Thr Leu Asn Arg Glu Gly Val Tyr | Thr Val Phe Ala Pro Thr Asn 
GAG ACC CTC AAC CGG GAA GGA GTC TAC ACA GTC TTT GCT CCC ACA AAT 1632 

545 560 
Glu Ala Phe Arg Ala Leu Pro Pro Arg Glu Arg Ser Arg Leu Leu Gly 
GAA GCC TTC CGA GCC CTG CCA CCA AGA GAA CGG AGC AGA CTC TTG GGA 1680 

570 
Asp Ala Lys Glu Leu Ala Asn Ile Leu Lys Tyr His Ile Gly Asp Glu 
GAT GCC AAG GAA CTT GCC AAC ATC CTG AAA TAC CAC ATT GGT GAT GAA 1728 

585 
Ile Leu Val Ser Gly Gly Ile Gly Ala Leu Val Arg Leu Lys Ser Leu 
ATC CTG GTT AGC GGA GGC ATC GGG GCC CTG GTG CGG CTA AAG TCT CTC 1776 

595 
Gln Gly Asp Lys Leu Glu Val Ser Leu Lys Asn Asn Val Val Ser Val 
CAA GGT GAC AAG CTG GAA GTC AGC TTG AAA AAC AAT GTG GTG AGT GTC 1824 

610 620 
Asn Lys Glu Pro Val Ala Glu Pro Asp Ile Met Ala Thr Asn Gly Val 
AAC AAG GAG CCT GTT GCC GAG CCT GAC ATC ATG GCC ACA AAT GGC GTG 1872 

Figure 5 (iii) 
635 
Val His Val Ile Thr Asn Val Leu Gln Pro Pro Ala Asn Arg Pro Gln 
GTC CAT GTC ATC ACC AAT GTT CTG CAG CCT CCA GCC AAC AGA CCT CAG 1920 

645 
Glu | Arg Gly Asp | Glu Leu Ala Asp Ser Ala Leu Glu Ile Phe Lys Gln 
GAA AGA GGG GAT GAA CTT GCA GAC TCT GCG CTT GAG ATC TTC AAA CAA 1968 

660 670 
Ala Ser Ala Phe Ser Arg Ala Ser Gln Arg Ser Val Arg Leu Ala Pro 
GCA TCA GCG TTT TCC AGG GCT TCC CAG AGG TCT GTG CGA CTA GCC CCT 2016 

Val Tyr Gln Lys Leu Leu Glu Arg Met Lys His *** 
GTC TAT CAA AAG TTA TTA GAG AGG ATG AAG CAT TAG CTTGAAGCACTACAG 2067 

GAGGAATGCACCACGGCAGCTCTCCGCCAATTTCTCTCAGATTTCCACAGAGACTGTTTGAATG 2131 

TTTTCAAAACCAAGTATCACACTTTAATGTACATGGGCCGCACCATAATGAGATGTGAGCCTTC 2195 

TGCATGTGGGGGAGGAGGGAGAGAGATGTACTTTTTAAATCATGTTCCCCCTAAACATGGCTGT 2259 

TAACCCACTGCATGCAGAAACTTGGATGTCACTGCCTGACATTC^CTT 2323 

CCAAATGTGGAATTGACTGCCTATGCCAAGTCCCTGGAAAAGGAGCTTCAGTATTGTGGGKSCT^ 2387 

ATAAAACATGAATCAAGCAATCCAGCCTCATGGGAAGTCCTGGCACAGTTTT^ 2451 

GCACAGCTGGAGAAATGGCATCATTATAAGCTATGAGTTGAAATGTTCTGTCAAATGTGTCTCA 2515 

CATCTACACGTGGCTTGGAGGCTTTTATGGGGCCCT^ 2579 

AGCTTAGATTTCCCTATTGTGACAGAGCCATGGTGTGT 2643 

I I 

A 2644 